    Challenges in speech processing of Slavic languages (case studies in speech recognition of Czech and

    Slavic languages pose a major challenge for researchers working on speech technology. They exhibit a high degree of inflection, namely declension of nouns, pronouns and adjectives, and conjugation of verbs. This has a large impact on the size of lexical inventories in these languages and significantly complicates the design of text-to-speech and, in particular, speech-to-text systems. In the paper, we demonstrate some typical features of the Slavic languages and show how they can be handled in the development of practical speech processing systems. We present the solutions we applied in the design of voice dictation and broadcast speech transcription systems developed for Czech. Furthermore, we demonstrate how these systems can be ported to another, similar Slavic language, in our case Slovak. All the presented systems operate in real time with very large vocabularies (350K words in Czech, 170K words in Slovak), and some of them have already been deployed in practice.

    A cross-lingual adaptation approach for rapid development of speech recognizers for learning disabled users

    Building a voice-operated system for learning disabled users is a difficult task that requires a considerable amount of time and effort. Due to the wide spectrum of disabilities and their different related phonopathies, most available approaches target a specific pathology. This may improve their accuracy for some users, but makes them unsuitable for others. In this paper, we present a cross-lingual approach to adapting a general-purpose modular speech recognizer for learning disabled people. The main advantage of this approach is that it allows rapid and cost-effective development by taking an already built speech recognition engine and its modules, and using existing resources for standard speech in different languages to recognize the users’ atypical voices. Although the recognizers built with the proposed technique obtain lower accuracy rates than those trained for specific pathologies, they can be used by a wide population and developed more rapidly, which makes it possible to design various types of speech-based applications accessible to learning disabled users. This research was supported by the project ‘Favoreciendo la vida autónoma de discapacitados intelectuales con problemas de comunicación oral mediante interfaces personalizados de reconocimiento automático del habla’, financed by the Centre of Initiatives for Development Cooperation (Centro de Iniciativas de Cooperación al Desarrollo, CICODE), University of Granada, Spain. This research was also supported by the Student Grant Scheme 2014 (SGS) at the Technical University of Liberec.

    Feature selection methods for hidden Markov model-based speech recognition

    In the paper, three different feature selection methods applicable to speech recognition are presented and discussed. Widely known approaches, such as principal component analysis, discriminant feature analysis and sequential search methods, have been customised for use with a hidden Markov model-based classifier. When comparing the methods, we focus mainly on their ability to reduce the size of the feature vectors commonly used in speech processing. It is demonstrated that the sequential methods and the discriminative analysis are well suited to that task. Both of them may reduce recognition time by a factor higher than two without a significant loss of accuracy, particularly in combination with a two-level classification scheme. © 1996 IEEE

    Úvodní kurs počítačového zpracování řeči pro studenty bakalářského studia (An introductory course in computer speech processing for bachelor's students)

    The article presents the concept of a one-semester course prepared and taught by the author during his stay at ETH in Zurich in 2006.

    A two-level classification scheme for CDHMM-based discrete-utterance recognition

    In the paper, a method for speeding up the response of a CDHMM-based speech recognition system is introduced. The method, applicable to the recognition of discrete utterances, uses a two-level classification scheme. It consists of a fast match done with simplified models, followed by a final accurate match with a limited number of selected standard models. In this way, the recognition time can be reduced by a great deal without any significant loss of recognition accuracy. The method has been successfully applied in the design of real-time speech recognition systems operating with small and middle-size vocabularies.
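
The two-level idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the CDHMM models are replaced by simple distance-based scorers, and every name here (`two_level_classify`, `make_scorer`, `shortlist_size`, the toy word labels and templates) is a hypothetical stand-in chosen for illustration. The cheap scorer looks at one feature only, the accurate scorer at all features.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Frames = Sequence[Sequence[float]]
Scorer = Callable[[Frames], float]  # higher score = better match


def two_level_classify(
    frames: Frames,
    fast_models: Dict[str, Scorer],
    fine_models: Dict[str, Scorer],
    shortlist_size: int = 2,
) -> Tuple[str, List[str]]:
    """Return (best_word, shortlist) using a fast match, then a fine match."""
    if not fast_models:
        raise ValueError("at least one model is required")
    # Level 1: score every word with its cheap, simplified model.
    fast_scores = {word: score(frames) for word, score in fast_models.items()}
    shortlist = sorted(fast_scores, key=fast_scores.get, reverse=True)[:shortlist_size]
    # Level 2: run the expensive, accurate models only on the shortlist.
    best = max(shortlist, key=lambda word: fine_models[word](frames))
    return best, shortlist


def make_scorer(template: Sequence[float], dims: Sequence[int]) -> Scorer:
    """Toy stand-in for a model: negative squared distance on selected features."""
    def score(frames: Frames) -> float:
        return -sum((f[d] - template[d]) ** 2 for f in frames for d in dims)
    return score


def demo() -> Tuple[str, List[str]]:
    # Hypothetical word templates (assumption, not data from the paper).
    templates = {
        "alpha": (0.0, 0.0),
        "beta": (1.0, 0.0),
        "gamma": (1.0, 1.0),
        "delta": (5.0, 5.0),
    }
    fast = {w: make_scorer(t, dims=[0]) for w, t in templates.items()}
    fine = {w: make_scorer(t, dims=[0, 1]) for w, t in templates.items()}
    frames = [(0.9, 0.9), (1.1, 1.0)]
    return two_level_classify(frames, fast, fine, shortlist_size=2)


if __name__ == "__main__":
    print(demo())
```

In this toy setup the fast match cannot separate "beta" from "gamma" because it ignores the second feature, so both reach the shortlist and the fine match decides between them. The accurate models are evaluated for two words instead of four, which is where the time saving would come from in a real system.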

    Discrete-utterance recognition with a fast match based on total data reduction

    In the paper, a two-level classification scheme applicable to practical discrete-utterance recognition systems is presented. Both the fast and the fine match employ CDHMM whole-word models. The fast match is based on total data reduction, which includes both the minimization of the acoustic data flow (the numbers of speech frames and features) and the reduction of the basic HMM parameters (the numbers of states and mixtures). The fast match parameters are chosen by a procedure that aims at minimizing the total classification time while preserving the maximum available recognition accuracy. On a medium-size vocabulary task (121 city names), the fast match reduced recognition time to approx. 20% of that of the original one-level system, with a negligible loss of accuracy. The time savings were even more considerable in the case of a system with multi-mixture HMMs.

    Lze použít automatické rozpoznávání řeči k hodnocení kvality řeči? (Can automatic speech recognition be used to evaluate speech quality?)

    In the contribution, several case studies are presented in which automatic speech recognition was tested as a means of evaluating speech quality, either human or synthetic. Usually, speech quality is measured by subjective listening tests. Our aim is to investigate whether these tests, which require a considerable amount of human time and experience, could be replaced or supplemented by techniques based on ASR.

    Unified Approach to Development of ASR Systems for East Slavic Languages

    Methods for Rapid Development of Automatic Speech Recognition System for Russian
